Semi-supervised logistic discrimination via labeled data and unlabeled data from different sampling distributions
Similar resources
Semi-supervised logistic discrimination via regularized Gaussian basis expansions
The problem of constructing classification methods based on both classified and unclassified data sets is considered for analyzing data with complex structures. We introduce a semi-supervised logistic discriminant model with Gaussian basis expansions. Unknown parameters in the logistic model are estimated by a regularization method together with the EM algorithm. For selection ...
Learning from Positive and Unlabeled Examples with Different Data Distributions
We study the problem of learning from positive and unlabeled examples. Although several techniques exist for dealing with this problem, they all assume that positive examples in the positive set P and the positive examples in the unlabeled set U are generated from the same distribution. This assumption may be violated in practice. For example, one wants to collect all printer pages from the Web...
Semi-Supervised Text Classification Using Positive and Unlabeled Data
Text classification using positive and unlabeled data refers to the problem of building a text classifier using positive documents (P) of one class and unlabeled documents (U) of many other classes. U consists of positive and negative documents. Some existing methods for solving the PU-Learning problem build a classifier in a two-step process. Generally speaking, these existing methods do ...
Semi-supervised Learning from Unbalanced Labeled Data - An Improvement
We present a great improvement in semi-supervised learning from training data sets in which only a small fraction of the data pairs is labeled. In particular, we propose a novel decision strategy based on normalized model outputs. We explain why the normalization step helps. The paper compares the performance of two popular semi-supervised approaches (Consistency Metho...
Estimate Unlabeled-Data-Distribution for Semi-supervised PU Learning
Traditional supervised classifiers use only labeled data (features/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, it is often the case that the labeled data is hard to obtain and the unlabeled data contains the instances that belong to the predefined class beyond the labeled data categories. This problem has been widely studied in recent year...
Journal
Journal title: Statistical Analysis and Data Mining: The ASA Data Science Journal
Year: 2013
ISSN: 1932-1864
DOI: 10.1002/sam.11204